Training Language Models Using Target-Propagation
Abstract
While Truncated Back-Propagation through Time (BPTT) is the most popular approach to training Recurrent Neural Networks (RNNs), it suffers from being inherently sequential (making parallelization difficult) and from truncating gradient flow between distant time-steps. We investigate whether Target Propagation (TPROP) style approaches can address these shortcomings. Unfortunately, extensive experiments suggest that TPROP generally underperforms BPTT, and we conclude with an analysis of this phenomenon and suggestions for future work.
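The truncation the abstract refers to can be made concrete with a small sketch. The code below is illustrative only (a hypothetical scalar RNN, not the paper's model): it computes the gradient of the final hidden state with respect to the recurrent weight, backpropagating through at most `k` time-steps, so gradient flow from earlier steps is cut off exactly as in truncated BPTT.

```python
import numpy as np

def truncated_bptt_grad(w, u, xs, k):
    """Gradient of L = h_T w.r.t. w for a scalar RNN
    h_t = tanh(w * h_{t-1} + u * x_t), h_0 = 0,
    backpropagating through at most k steps (truncated BPTT)."""
    T = len(xs)
    # Forward pass: record every hidden state.
    h = 0.0
    hs = [h]
    for x in xs:
        h = np.tanh(w * h + u * x)
        hs.append(h)
    # Backward pass: dL/dh_T = 1; stop after k steps,
    # discarding gradient contributions from earlier time-steps.
    grad_w = 0.0
    dh = 1.0
    for t in range(T, max(T - k, 0), -1):
        pre = w * hs[t - 1] + u * xs[t - 1]
        dpre = dh * (1.0 - np.tanh(pre) ** 2)  # through tanh
        grad_w += dpre * hs[t - 1]             # dpre/dw (local term)
        dh = dpre * w                          # gradient into h_{t-1}
    return grad_w

# With k = T this matches full BPTT; with k < T the gradient
# ignores dependencies on time-steps more than k back.
```

Setting `k` equal to the sequence length recovers the exact gradient (verifiable against finite differences), while small `k` yields the biased gradient that motivates the search for alternatives such as TPROP.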
Similar References
Piecewise Training for Undirected Models
For many large undirected models that arise in real-world applications, exact maximum-likelihood training is intractable, because it requires computing marginal distributions of the model. Conditional training is even more difficult, because the partition function depends not only on the parameters, but also on the observed input, requiring repeated inference over each training example. An appea...
Assessing the Performance of Corroding RC Bridge Decks: A Critical Review of Corrosion Propagation Models
Corrosion of steel reinforcement is one of the most prevalent causes of deterioration of reinforced concrete (RC) structures in chloride-contaminated environments. As a result, evaluating the impact of any possible corrosion-induced damage to reinforced concrete bridges strongly affects management decisions, such as inspection, maintenance and repair actions. The corrosion propagation phase is a ...
Using Back-Propagation (BPN) neural networks for basic knowledge of the English language diagnosis
This article studies the expediency of using neural networks technology and the development of back-propagation network (BPN) models for modeling automated evaluation of the answers and progress of deaf students who possess basic knowledge of the English language and computer skills, within a virtual e-learning environment. The performance of the developed neural models is evaluated with the...
Unsupervised training of maximum-entropy models for lexical selection in rule-based machine translation
This article presents a method of training maximum-entropy models to perform lexical selection in a rule-based machine translation system. The training method described is unsupervised; that is, it does not require any annotated corpus. The method uses source-language monolingual corpora, the machine translation (MT) system in which the models are integrated, and a statistical target-language m...
MAP-based cross-language adaptation augmented by linguistic knowledge: from English to Chinese
Construction of a recognizer in a new target language usually involves collection of a comprehensive database in that language as well as manual annotation and model training. For rapid development of a new-language recognizer, we propose (1) substituting target-language phoneme models with source-language phoneme models trained previously; and (2) adapting target-language phoneme models from sourc...
Journal:
- CoRR
Volume: abs/1702.04770
Pages: -
Publication year: 2017